Reduced bootstrap for the median
In this paper we study a modified bootstrap that consists of only considering
those bootstrap samples satisfying k1 ≤ νn ≤ k2, for some 1 ≤ k1 ≤ k2 ≤ n,
where νn is the number of distinct original observations in the bootstrap sample. We call it the reduced bootstrap, since it uses only a portion of the set of all possible bootstrap samples. We show that, under some conditions on k1 and k2, the reduced bootstrap consistently estimates the distribution and the variance of the sample median. Unlike the ordinary bootstrap variance estimator, the reduced bootstrap variance estimator is consistent without conditions on the population generating the data, but it does rely on an adequate choice of k1 and k2. Since several choices of k1 and k2 yield consistent estimators, we compare the finite sample performance of the corresponding estimators through a simulation study. The simulation study
also considers consistent variance estimators proposed by other authors.
Ministerio de Educación y Cienci
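The resampling rule described in this abstract can be illustrated with a minimal sketch. It assumes the restriction k1 ≤ νn ≤ k2 is enforced by simple rejection sampling, which may differ from the scheme used in the paper. The function name, the default number of resamples and the safety cap on draws are all hypothetical.

```python
import random
import statistics


def reduced_bootstrap_median_variance(data, k1, k2, n_boot=1000, seed=0):
    """Variance of the sample median over resamples with k1 <= nu_n <= k2.

    Rejection sampling is an assumption of this sketch, not necessarily
    the scheme used in the paper.
    """
    n = len(data)
    if not 1 <= k1 <= k2 <= n:
        raise ValueError("need 1 <= k1 <= k2 <= n")
    rng = random.Random(seed)
    max_draws = 1000 * n_boot  # hypothetical safety cap on rejected draws
    medians = []
    for _ in range(max_draws):
        # Ordinary bootstrap draw: n indices sampled with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        nu = len(set(idx))  # number of distinct original observations
        if k1 <= nu <= k2:  # keep only the "reduced" set of resamples
            medians.append(statistics.median([data[i] for i in idx]))
            if len(medians) == n_boot:
                return statistics.pvariance(medians)
    raise RuntimeError("too few accepted resamples; widen [k1, k2]")
```

Setting k1 = 1 and k2 = n accepts every resample, so the sketch then reduces to the ordinary bootstrap variance estimator.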
Time series clustering for estimating particulate matter contributions and its use in quantifying impacts from deserts
Source apportionment studies use prior exploratory methods that are not purpose-oriented and receptor
modelling is based on chemical speciation, requiring costly, time-consuming analyses. Hidden Markov
Models (HMMs) are proposed as a routine, exploratory tool to estimate PM10 source contributions. These
models were used on annual time series (TS) data from 33 background sites in Spain and Portugal. HMMs
enable the creation of groups of PM10 TS observations with similar concentration values, defining the
pollutant's regimes of concentration. The results include estimations of source contributions from these
regimes, the probability of change among them and their contribution to annual average PM10 concentrations. The annual average Saharan PM10 contribution in the Canary Islands was estimated and
compared to other studies. A new procedure for quantifying the wind-blown desert contributions to
daily average PM10 concentrations from monitoring sites is proposed. This new procedure seems to
correct the net load estimation from deserts achieved with the most frequently used method.
A Monte Carlo comparison of three consistent bootstrap procedures
Since bootstrap samples are simple random samples with replacement from the original sample, the information content of some bootstrap samples can be very low. To avoid this, several variants of the classical bootstrap have been proposed. In this paper we consider two of them: the
sequential or Poisson bootstrap and the reduced bootstrap. Both of them, like the ordinary bootstrap, can yield second-order accurate distribution estimators; that is, the three bootstrap procedures are asymptotically equivalent. The question that naturally arises is which of them should be used in practice, that is, for finite sample sizes. To try to answer this question, we have carried out a simulation study. Although no method was found to perform best in all the situations considered, some recommendations are given.
Ministerio de Educación y Cienci
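As a rough illustration of the sequential bootstrap mentioned in this abstract, the sketch below draws observations with replacement until a target number m of distinct original observations has appeared, so the resample size is random. The default m, roughly n(1 - 1/e), follows one common description of the method and is an assumption here, as are the function and parameter names. The abstract itself does not specify these details.

```python
import math
import random


def sequential_bootstrap_sample(data, m=None, rng=None):
    """Draw with replacement until m distinct original observations appear."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    rng = rng or random.Random()
    if m is None:
        # Assumed default: about n * (1 - 1/e) distinct observations.
        m = min(n, int(n * (1 - math.exp(-1))) + 1)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    seen = set()   # indices of distinct original observations drawn so far
    sample = []    # the resample itself; its length is random
    while len(seen) < m:
        i = rng.randrange(n)
        seen.add(i)
        sample.append(data[i])
    return sample
```

Every resample produced this way contains exactly m distinct original observations, which removes the low-information resamples that the ordinary bootstrap can generate.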
Caso prático: A análise dos problemas financeiros da criação de microempresas com a ajuda de máquinas de vetores de suporte
Despite the leading role that micro-entrepreneurship plays in economic development,
and the high failure rate of microenterprise start-ups in their early years, very few studies have designed
financial distress models to detect the financial problems of micro-entrepreneurs. Moreover,
due to a lack of research, nothing is known about whether non-financial information and nonparametric
statistical techniques improve the predictive capacity of these models. Therefore, this
paper provides an innovative financial distress model specifically designed for microenterprise startups
via support vector machines (SVMs) that employs financial, non-financial, and macroeconomic
variables. Based on a sample of almost 5,500 micro-entrepreneurs from a Peruvian Microfinance
Institution (MFI), our findings show that the introduction of non-financial information related to the
zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur
relationship, the number of loans granted by the MFI in the last year, the loan destination, and
the opinion of experts on the probability that microenterprise start-ups may experience financial
problems, significantly increases the accuracy of our financial distress model. Furthermore,
the results reveal that the models that use SVMs outperform those which employ traditional
logistic regression (LR) analysis.
Modelling background air pollution exposure in urban environments: Implications for epidemiological research
Background pollution represents the lowest levels of ambient air pollution to which the population is
chronically exposed, but few studies have focused on thoroughly characterizing this regime. This study
uses clustering statistical techniques as a modelling approach to characterize this pollution regime while
deriving reliable information to be used as estimates of exposure in epidemiological studies. The background levels of four key pollutants in five urban areas of Andalusia (Spain) were characterized over an
11-year period (2005–2015) using four widely-known clustering methods. For each pollutant data set,
the first (lowest) cluster representative of the background regime was studied using finite mixture
models, agglomerative hierarchical clustering, hidden Markov models (HMMs) and k-means. The HMM
approach outperforms the other techniques, providing important estimates of exposure
related to background pollution, such as its mean, acuteness and time incidence values in the ambient air, for
all the air pollutants and sites studied.
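The idea of taking the first (lowest) cluster as the background regime can be sketched with k-means, one of the four methods compared. This is a minimal one-dimensional illustration under stated assumptions: the quantile-based initialisation, the function names and the use of the share of observations as a stand-in for time incidence are all hypothetical. It is not the HMM approach that the study found to perform best.

```python
def kmeans_1d(values, k, n_iter=100):
    """Plain one-dimensional k-means (Lloyd's algorithm)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= number of observations")
    # Assumed initialisation: centres at evenly spaced sample quantiles.
    centres = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]

    def assign(cs):
        # Put each observation in the group of its nearest centre.
        groups = [[] for _ in cs]
        for x in xs:
            nearest = min(range(len(cs)), key=lambda j: abs(x - cs[j]))
            groups[nearest].append(x)
        return groups

    for _ in range(n_iter):
        groups = assign(centres)
        # An empty group keeps its previous centre.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    return centres, assign(centres)


def background_regime(values, k):
    """Mean of the lowest cluster and its share of all observations."""
    centres, groups = kmeans_1d(values, k)
    lowest = min(range(k), key=lambda j: centres[j])
    members = groups[lowest]
    return sum(members) / len(members), len(members) / len(values)
```

For a series of daily concentrations, `background_regime(series, k)` returns the mean level of the lowest regime together with the fraction of days assigned to it.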
Review and Comparison of Intelligent Optimization Modelling Techniques for Energy Forecasting and Condition-Based Maintenance in PV Plants
Within the field of soft computing, intelligent optimization modelling techniques include
various major techniques in artificial intelligence. These techniques aim to generate new business
knowledge by transforming sets of "raw data" into business value. One of the principal applications of
these techniques is related to the design of predictive analytics for the improvement of advanced
CBM (condition-based maintenance) strategies and energy production forecasting. These advanced
techniques can be used to transform control system data, operational data and maintenance event data
to failure diagnostic and prognostic knowledge and, ultimately, to derive expected energy generation.
These techniques can have a large impact when applied to the
legacy monitoring systems of solar PV energy generation plants. These systems produce a
great amount of data over time, yet they demand considerable effort to
increase their performance through more accurate predictive analytics that reduce production
losses, which have a direct impact on ROI. How to choose the most suitable techniques to apply is one of
the problems to address. This paper presents a review and a comparative analysis of six intelligent
optimization modelling techniques, which have been applied on a PV plant case study, using the
energy production forecast as the decision variable. The proposed methodology not only aims
to elicit the most accurate solution but also validates the results by comparing the
outputs of the different techniques.
e-Encuestas Probabilísticas II. Los Métodos de Muestreo Probabilístico
This work mainly addresses the study of surveys conducted over the Internet. Specifically, it focuses on the formulation and development of probabilistic sampling designs that allow surveys to be carried out on the World Wide Web with the rigour needed to infer the results obtained to the population under study, with a given reliability.
A new approach to influence analysis in linear models
We propose a new approach to the study of influence in the General Linear Model based on conditional bias. This approach enables us to apply such an analysis to all particular cases of this model. The theoretical foundation on which this approach is based does not presuppose a particular hypothesis on the distribution of the variables. Applying the results obtained to the Multiple Linear Regression Model yields measures of influence already proposed by other authors. Finally, we apply the results to the analysis of covariance.
Modeling the Financial Distress of Microenterprise Start-Ups Using Support Vector Machines: A Case Study
Despite the leading role that micro-entrepreneurship plays in economic development, and the high failure rate of microenterprise start-ups in their early years, very few studies have designed financial distress models to detect the financial problems of micro-entrepreneurs. Moreover, due to a lack of research, nothing is known about whether non-financial information and non-parametric statistical techniques improve the predictive capacity of these models. Therefore, this paper provides an innovative financial distress model specifically designed for microenterprise start-ups via support vector machines (SVMs) that employs financial, non-financial, and macroeconomic variables. Based on a sample of almost 5,500 micro-entrepreneurs from a Peruvian Microfinance Institution (MFI), our findings show that the introduction of non-financial information related to the zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur relationship, the number of loans granted by the MFI in the last year, the loan destination, and the opinion of experts on the probability that microenterprise start-ups may experience financial problems, significantly increases the accuracy of our financial distress model. Furthermore, the results reveal that the models that use SVMs outperform those which employ traditional logistic regression (LR) analysis.